Educational Data Mining and Learning Analytics: Applications to Constructionist Research

Authors

  • Matthew Berland
  • Ryan Shaun Joazeiro de Baker
  • Paulo Blikstein
Abstract

Constructionism can be a powerful framework for teaching complex content to novices. At the core of constructionism is the suggestion that by enabling learners to build creative artifacts that require complex content to function, those learners will have opportunities to learn this content in contextualized, personally meaningful ways. In this paper, we investigate the relevance of a set of approaches broadly called “educational data mining” or “learning analytics” (henceforth, EDM) to help provide a basis for quantitative research on constructionist learning which does not abandon the richness seen as essential by many researchers in that paradigm. We suggest that EDM may have the potential to support research that is meaningful and useful both to researchers working actively in the constructionist tradition and to wider communities. Finally, we explore potential collaborations between researchers in the EDM and constructionist traditions; such collaborations have the potential to enhance the ability of constructionist researchers to make rich inferences about learning and learners, while providing EDM researchers with many interesting new research questions and challenges.

In recent years, project-based, student-centered approaches to education have gained prominence, due in part to an increased demand for higher-level skills in the job market (Levy and Murnane, 2004), positive research findings on the effectiveness of such approaches (Barron, Pearson, et al., 2008), and a broader acceptance in public policy circles, as shown, for example, by the Next Generation Science Standards (NGSS Lead States, 2013). While several approaches for this type of learning exist, constructionism is one of the most popular and well-developed (Papert, 1980). In this paper, we investigate the relevance of a set of approaches called “educational data mining” or “learning analytics” (henceforth abbreviated as ‘EDM’) (R. Baker & Yacef, 2009; Romero & Ventura, 2010a; R. 
Baker & Siemens, in press) to help provide a basis for quantitative research on constructionist learning which does not abandon the richness seen as essential by many researchers in that paradigm. As such, EDM may have the potential to support research that is meaningful and useful both to researchers working actively in the constructionist tradition and to the wider community of learning scientists and policymakers. EDM, broadly, is a set of methods that apply data mining and machine learning techniques such as prediction, classification, and discovery of latent structural regularities to rich, voluminous, and idiosyncratic educational data, potentially similar to the data generated by many constructionist learning environments, which allow students to explore and build their own artifacts, computer programs, and media pieces. We identify four axes along which EDM methods may be helpful for constructionist research:

1. EDM methods do not require constructionists to abandon deep qualitative analysis for simplistic summative or confirmatory quantitative analysis;
2. EDM methods can generate different and complementary new analyses to support qualitative research;
3. By enabling precise formative assessments of complex constructs, EDM methods can support an increase in methodological rigor and replicability;
4. EDM can be used to present comprehensible and actionable data to learners and teachers in situ.

In order to investigate those axes, we start by describing our perspective on compatibilities and incompatibilities between constructionism and EDM. At the core of constructionism is the suggestion that by enabling learners to build creative artifacts that require complex content to function, those learners will have opportunities to learn that complex content in connected, meaningful ways. 
Constructionist projects often emphasize making those artifacts (and often data) public, socially relevant, and personally meaningful to learners, and encourage working in social spaces such that learners engage each other to accelerate the learning process. diSessa and Cobb (2004) argue that constructionism serves as a framework for action, as it describes its own praxis (i.e., how it matches theory to practice). The learning theory supporting constructionism is classically constructivist, combining concepts from Piaget and Vygotsky (Fosnot, 2005). As constructionism matures as a constructivist framework for action and expands in scale, constructionist projects are becoming more complex (Reynolds & Caperton, 2011), more scalable (Resnick, Maloney, et al., 2009), and more affordable for schools, following significant development in low-cost “construction” technologies such as robotics and 3D printers. As a result, there have been increasing opportunities to learn more about how students learn in constructionist contexts, advancing the science of learning. These discoveries have the potential to improve the quality of all constructivist learning experiences. For example, Wilensky and Reisman (2006) have shown how constructionist modeling and simulation can make science learning more accessible, Resnick (1998) has shown how constructionism can reframe programming as art at scale, Buechley & Eisenberg (2008) have used e-textiles to engage female students in robotics, and Eisenberg (2011) and Blikstein (2013, 2014) have used constructionist digital fabrication to successfully teach programming, engineering, and electronics in a novel, integrated way. The findings of these research and design projects have the potential to be useful to a wide external community of teachers, researchers, practitioners, and other stakeholders. 
However, connecting findings from the constructionist tradition to the goals of policymakers can be challenging, due to the historical differences in methodology and values between these communities. The resources needed to study such interventions at scale are considerable, given the need to carefully document, code, and analyze each student’s work processes and artifacts. The designs of constructionist research often result in findings that do not map to what researchers, outside interests, and policymakers are expecting, in contrast to conventional controlled studies, which are designed to (more conclusively) answer a limited set of sharply targeted research questions. Due to the lack of a common ground for discussing the benefits and scalability of constructionist and project-based designs, these designs have been too frequently sidelined to niche institutions such as private schools, museums, or atypical public schools. To understand what role EDM methods can play in constructionist research, we must frame what we mean by constructionist research more precisely. We follow Papert and Harel (1991) in their situating of constructionism, but they do not constrain the term to one formal definition. The definition is further complicated by the fact that constructionism has many overlaps with other research and design traditions, such as constructivism and socio-constructivism themselves, as well as project-based pedagogies and inquiry-based designs. However, we believe that it is possible to define the subset of constructionism amenable to EDM, a focus we adopt in this article for brevity. In this paper, we focus on the constructionist literature dealing with students learning to construct understandings by constructing (physical or virtual) artifacts, where the students’ learning environments are designed and constrained such that building artifacts in or with that environment helps students construct their own understandings. 
In other words, we are focusing on creative work done in computational environments designed to foster creative and transformational learning, such as NetLogo (Wilensky, 1999), Scratch (Resnick, Maloney, et al., 2009), or LEGO Mindstorms. This sub-category of constructionism can and does generate considerable formative and summative data. It also has the benefit of having a history of success in the classroom. From Papert’s seminal (1972) work through today, constructionist learning has been shown to promote the development of deep understanding of relatively complex content, with many examples ranging from mathematics (Harel, 1990; Wilensky, 1996) to history (Zahn, Krauskopf, Hesse, & Pea, 2010). However, constructionist learning environments, ideas, and findings have yet to reach the majority of classrooms and have had incomplete influence in the broader education research community. There are several potential reasons for this. One of them may be a lack of demonstration that findings are generalizable across populations and across specific content. Another reason is that constructionist activities are seen to be time-consuming for teachers (Warschauer & Matuchniak, 2010), though, in practice, it has been shown that supporting understanding through project-based work could actually save time (Fosnot, 2005) and enable classroom dynamics that may streamline class preparation (e.g., peer teaching or peer feedback). A last reason is that constructionists almost universally value deep understanding of scientific principles more than facts or procedural skills, even in contexts (e.g., many classrooms) in which memorization of facts and procedural skills is the target to be evaluated (Abelson & diSessa, 1986; Papert & Harel, 1991). Therefore, much of what is learned in constructionist environments does not directly translate to test scores or other established metrics. 
Constructionist research can be useful and convincing to audiences that do not yet take full advantage of the scientific findings of this community, but it requires careful consideration of framing and evidence to reach them. Educational data mining methods have the potential both to enhance constructionist research and to support constructionist researchers in communicating their findings in a fashion that other researchers consider valid. Blikstein (2011, p. 110) made the argument that “one of the difficulties is that current assessment instruments are based on products [...], and not on processes, due to the intrinsic difficulties in capturing detailed process data for large numbers of students. [...] However, new data collection, sensing, and data mining technologies [...] are enabling researchers to have an unprecedented insight into the minute-by-minute development of several activities.” By enabling scalable and precise assessments of more complex constructs than can be typically assessed through traditional assessment instruments (such as multiple-choice tests), EDM methods support an increase in methodological rigor and replicability, while maintaining much (though not all) of the richness of qualitative methods. EDM methods do not require constructionists to abandon qualitative and meaningful evaluation for simplistic multiple-choice tests; instead, EDM can add some of the benefits of quantitative work to rich qualitative understanding. Furthermore, EDM may generate new understandings of how students learn in constructionist learning environments and of how to adapt our environments to those new understandings. Importantly, EDM provides a powerful set of methods that can be used to present actionable data to learners and teachers, by which we can give learners the tools to help themselves and use their own data. 
Through this paper, we will examine that potential in terms of current work in EDM and constructionism, potential research overlaps, and open questions generated by bringing them together.

Grading and Assessment

The limitations of traditional tests and assessments are well-known (E. L. Baker et al., 2010), but those tests remain standard in most schooling, due to the ease of administration and the perceived need for assessment of student success and teacher quality. Regarding alternative forms of assessment for constructionist learning, Papert (1980) suggested that detailed peer critiques (or “crits”) in an art class, or actual use of a student’s tool in an authentic setting, can provide meaningful feedback. This is undoubtedly true, but the feedback received in these formats is not very precise or well-defined, and it takes much longer than automated forms of feedback (e.g., compiler feedback about bugs in the code). There is no reason why broader assessments such as crits cannot live alongside more fine-grained assessments such as compiler feedback or the types of process assessments that EDM can generate. However, EDM can support continual and real-time assessment of student process and progress, in which the amount of formative feedback is radically increased. This allows for faster progress overall (Black & Wiliam, 1998; Shute, 2008), more opportunity for teacher insight into students’ learning (Roschelle, Penuel, Yarnall, Shechtman, & Tatar, 2005), and a more constructive basis for continual assessment. This is important, as teachers frequently feel challenged in using constructionist tools in public school settings, where districts frequently mandate a minimum number of grades per week. Such mandates may unnecessarily impede teachers’ incorporation of constructionist practices, as teachers may find it very difficult to grade a large-scale project 2-3 times per week as an artifact unless the design process is broken down into artificially small subcomponents. 
Anecdotally, when instructing practicing teachers in constructionist practices, the first author has heard complaints from teachers that they are required to give at least two grades per week per assignment, even in projects spanning weeks or months; the teachers found such assessments to be difficult for projects that required exploration and creativity. Unfortunately, these rules are often a reality in contemporary classrooms, and they can hinder good project-based learning and teaching (Blumenfeld, Fishman, Krajcik, Marx, & Soloway, 2000). Fortunately, educational data mining can help teachers support such learning, which, owing to professed reasons of practicality, is often found only in more affluent schools (Warschauer & Matuchniak, 2010), by providing access to more data to support students’ progress monitoring and teachers’ continual assessment of progress. This is by no means a concrete solution to the problem of overly aggressive assessment, but it may provide teachers with concrete resources to argue against or (at least) nominally comply with the policy.

What Educational Data Mining Can Bring to the Table

Some of these goals for increasing the rigor of constructionist research and providing more valid assessment may be achieved by integrating methods from the emerging discipline of educational data mining and learning analytics (EDM). EDM has become a useful method for research in other educational paradigms, with the potential to offer both richness and rigor. EDM has been defined as “an emerging discipline, concerned with developing methods for exploring the unique types of data that come from educational settings, and using those methods to better understand students, and the settings which they learn in” (IEDMS, 2009). 
EDM research typically takes educational data and applies data mining techniques such as prediction (including classification), discovery of latent structure (such as clustering and q-matrix discovery), relationship mining (such as association rule mining and sequential pattern mining), and discovery with models to better understand learning and learners’ individual differences and choices (see R. Baker & Yacef, 2009, Romero & Ventura, 2010a, and R. Baker & Siemens, in press, for reviews of these methods in education). Prediction modeling algorithms automatically search through a space of candidate models to find the model which best infers a single predicted variable from some combination of other variables. These models are developed on some set of data and are typically validated for their ability to make accurate predictions for new students, and ideally also for new content (cf. R. Baker, Corbett, Roll, & Koedinger, 2008) and new populations of students (cf. Ocumpaugh et al., accepted). As such, developing a prediction model depends on knowing what the predicted variable is for a small set of data; a model is then created for this small set of data, and validated so that it can be applied at greater scale. For instance, one may collect data on whether 140 students demonstrated a scientific inquiry strategy while learning, develop a prediction model to infer whether the inquiry behavior occurred, validate it on subsets of the 140 students that were not included when creating the prediction model, and then use the model to make predictions about new students (e.g., Sao Pedro et al., 2010, 2012). As such, prediction models can be used to analyze the development of a student strategy or behavior in a fine-grained fashion, over longitudinal data or many students, in an unobtrusive and non-disruptive way. This allows much (though not all) of the richness of qualitative analysis, while being much more feasible to conduct at scale than qualitative analysis is. 
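The student-level validation described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy `rows` data, the single feature (a count of controlled trials per clip), the one-threshold classifier `fit_stump`, and the helper `student_level_cv` are hypothetical and are not taken from the studies cited. The sketch shows only that whole students are held out, so that every prediction is made for a student the model never saw during fitting.

```python
# Hypothetical labeled clips: (student_id, controlled_trials, human_label).
# human_label is 1 when a human coder judged that the inquiry strategy occurred.
rows = [
    ("s1", 0, 0), ("s1", 1, 0),
    ("s2", 4, 1), ("s2", 5, 1),
    ("s3", 1, 0), ("s3", 3, 1),
    ("s4", 0, 0), ("s4", 4, 1),
    ("s5", 2, 0), ("s5", 5, 1),
    ("s6", 1, 0), ("s6", 3, 1),
]


def accuracy(threshold, data):
    """Share of clips where 'feature >= threshold' matches the human label."""
    hits = sum((x >= threshold) == bool(y) for _, x, y in data)
    return hits / len(data)


def fit_stump(data):
    """Return the lowest threshold that reaches the highest training accuracy."""
    candidates = sorted({x for _, x, _ in data})
    return max(candidates, key=lambda t: (accuracy(t, data), -t))


def student_level_cv(data, k):
    """Estimate accuracy on unseen students using k folds of whole students."""
    students = sorted({s for s, _, _ in data})
    hits = total = 0
    for fold in range(k):
        held_out = set(students[fold::k])  # every clip of these students is held out
        train = [r for r in data if r[0] not in held_out]
        test = [r for r in data if r[0] in held_out]
        threshold = fit_stump(train)
        hits += sum((x >= threshold) == bool(y) for _, x, y in test)
        total += len(test)
    return hits / total


if __name__ == "__main__":
    print("threshold fitted on all data:", fit_stump(rows))
    print("held-out accuracy:", student_level_cv(rows, k=3))
```

On this toy data the threshold fitted to all clips reproduces every human label, while the estimate obtained from held-out students is lower. That gap is the reason validation on unseen students gives a more cautious picture of how a detector might behave when applied at scale.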
Prediction modeling may therefore prove useful for constructionist research, although relatively little work has been done in creating predictive models of creative constructionist learning environments. To date, prediction modeling has largely been used to model student strategies (Amershi & Conati, 2009; Sao Pedro et al., 2010, 2012), student behaviors associated with disengagement (R. Baker et al., 2008), student emotions (Dragon et al., 2008; D’Mello et al., 2010; Worsley & Blikstein, 2011), longer-term student learning (R. Baker, Gowda, & Corbett, 2011), and participation in future learning (e.g., dropout) (Arnold, 2010). Other EDM methods accomplish different goals, but share the virtue of enabling analysis of student behavior and learning at scale in a richer fashion than traditional quantitative methods. For example, cluster analysis finds the structure that emerges naturally from data, allowing researchers to search for patterns in student behavior that commonly occur in data, but which did not initially occur to the researcher. Relationship mining methods (such as sequential pattern mining) find sequences of learner behavior that manifest over time and are seen repeatedly or in many students. In all cases, once a model or finding obtained via data mining is validated to generalize across students and/or contexts, it can be applied at scale and used in “discovery with models” analyses, which leverage models at scale to infer the relationship between (for instance) student behaviors and learning outcomes, or student strategies and evidence on student engagement. While EDM research has been conducted on a range of different types of educational data, a large proportion of EDM research has involved more restrictive (or, at least, less creative) online learning environments. Early research in EDM often involved very structured learning environments, such as intelligent tutoring systems (cf. R. Baker, Corbett, and Koedinger, 2004; Beck and Woolf, 2000; Merceron & Yacef, 2004). 
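As a rough illustration of the relationship mining described above, the following Python sketch counts short action sequences that recur across students. It is a simplification under stated assumptions: the `logs` data and action names are invented, `frequent_sequences` is a hypothetical helper, and only contiguous sequences of a fixed length are counted, whereas full sequential pattern mining algorithms also allow gaps between actions. Support is defined here as the number of students whose log contains the sequence at least once.

```python
from collections import Counter

# Hypothetical action logs, one ordered list of actions per student.
logs = {
    "s1": ["edit", "run", "error", "edit", "run"],
    "s2": ["edit", "run", "error", "edit", "run", "error"],
    "s3": ["read", "edit", "run"],
    "s4": ["edit", "edit", "run", "error"],
}


def frequent_sequences(student_logs, length, min_support):
    """Return contiguous action sequences found in at least min_support logs."""
    support = Counter()
    for actions in student_logs.values():
        # A set ensures each student contributes at most once per sequence.
        seen = {
            tuple(actions[i:i + length])
            for i in range(len(actions) - length + 1)
        }
        support.update(seen)
    return {seq: n for seq, n in support.items() if n >= min_support}


if __name__ == "__main__":
    for seq, n in sorted(frequent_sequences(logs, 2, 3).items()):
        print(" -> ".join(seq), "appears in", n, "student logs")
```

Counting each student once per sequence keeps a single student who repeats one action pair many times from dominating the result, which matches the interest in behavior seen in many students.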
Data from structured learning environments such as intelligent tutoring systems was a useful place to start research in EDM, as the structure of the learning environment makes it easier to infer structure in the data. For example, these environments privilege clearly defined ‘skills’ that map onto student responses, each of which is clearly and a priori identified as correct or incorrect. That focus makes it easier to accomplish acceptable-quality inference of those defined skills, a task which can be a significant challenge in other types of learning environments. For this reason, data from structured learning environments remains a considerable part of the research literature in EDM. However, in recent years, EDM research has increasingly involved open-ended online learning environments. In the first issue of the Journal of Educational Data Mining, Amershi and Conati (2009) published an analysis of the strategic behaviors employed by successful and unsuccessful learners in a fully exploratory online learning environment, using cluster analysis to discover patterns in student behavior. In their environment, students explore the workings of a range of common search and other AI algorithms. Amershi and Conati discovered that ‘less successful’ learners are less likely to pause and self-explain during execution of an algorithm, and after completing algorithm execution. Less successful learners were also less likely to break down domain spaces into subspaces. It remains an open question whether this pattern would apply to, say, novices learning the Scratch programming language, and whether design modifications could help those novices create more substantive artifacts. In another example of research in a more open-ended online learning environment, Sao Pedro and colleagues (2010, 2012) analyzed student experimentation behaviors in a physical science simulation environment, as mentioned above. 
Through a combination of human annotation of log files and the use of prediction modeling to develop automated detectors that could replicate the judgments being made by the human coders, they were able to identify whether students were demonstrating skill in designing sequences of experiments, and infer latent experimentation skill in those students. A third example can be found in work by Lynch and colleagues (2008) to classify the structure of students’ argumentation strategies in an online legal reasoning system where students argue about U.S. Supreme Court cases. They used decision trees (a type of prediction modeling) to infer which attributes of students’ argumentation processes were predictive of students’ eventual scores on a legal reasoning test. These specific environments were not constructionist. However, the move towards conducting EDM in more open-ended online learning environments, and the growth in understanding how to discover and exploit the structure in data from these environments, create enabling conditions for extending these methods to constructionist learning. The process of extending EDM methods to constructionist data is not and will not be trivial; every new type of learning environment has required a learning process for EDM researchers. Typically, that learning process has involved a collaborative dialogue between experts in EDM and experts in the specific learning domain and online learning environment being studied. However, the successes in applying EDM methods to new domains and online learning environments give hope that the process of extending EDM to constructionism will be quite tractable. That is not to say that EDM can or should tackle all research questions. Pure qualitative methods remain the standard for the exploration of possibilities, and pure quantitative methods remain the standard for confirmatory studies and larger scale hypothesis testing. 
However, EDM can provide a third way to reap many of the benefits of both more traditional qualitative and quantitative analyses. The move towards using EDM to identify student meta-cognition and self-regulatory skill within structured learning environments is of potential value to researchers in the constructionist paradigm, where issues of learners learning to actively participate in and drive their own learning and complex performance are of strong interest. For instance, Jeong and colleagues (2010) have identified patterns of students’ transitions between problem-solving, self-assessment, and backtracking to reconsider previously learned material that distinguish between more successful and less successful learners. EDM enables rigorous, replicable, and precise description of learner behavior, as well as analysis of how those behaviors interact with other constructs of interest. Learner behavior can be tracked as it grows and changes over time. This approach plays a key role in Jeong et al.’s (2010) research into students’ patterns of self-regulation over time. EDM methods have even been used to predict students’ preparation for future learning of new and different materials from other paradigms (cf. R. Baker, Gowda, & Corbett, 2011), providing a tool for linking analyses of learning within constructivist learning environments to a student’s learning progression. Generally, EDM methods allow for linking assessments of various aspects of student learning and learning processes to a range of other constructs; they also support the linking of various aspects of student process and learning to each other. These types of research fall squarely into the type of EDM research referred to as discovery with models, where EDM models of various constructs are studied in relationship to one another and to assessments of other

Journal:
  • Technology, Knowledge and Learning

Volume 19

Publication date: 2014